N87-2653 


PROGRAMMING A HILLSLOPE WATER MOVEMENT MODEL ON THE MPP 

J. E. Devaney, A. R. Irving 
Science Applications Research 
Lanham, Maryland 

P. J. Camillo, R. J. Gurney 
NASA Goddard Space Flight Center 
Greenbelt, Maryland 


ABSTRACT 

We have developed a physically based 
numerical model of heat and moisture 
flow within a hi 1 1 s 1 ope on a parallel 
architecture computer, as a precursor 
to a model of a complete catchment. 
Moisture flow within a catchment 
includes evaporation, overland flow, 
flow in unsaturated soil, and flow in 
saturated soil. Because of the 
empirical evidence that moisture flow 
in unsaturated soil is mainly in the 
vertical direction, flow in the 
unsaturated zone can be modelled as 
a series of one-dimensional columns. 
This initial version of the hillslope 
model includes evaporation and a single 
column of one-dimensional unsaturated 
zone flow. This case has already been 
solved on an IBM 3081 computer and is 
now being applied to the MPP architec- 
ture so as to make the extension to 
the one-dimensional case easier and to 
check the problems and benefits of 
using a parallel-architecture machine. 

Keywords: Hydrology, Hillslopes, 

Parallel Processing, Unsaturated Flow, 
Evaporation. 


INTRODUCTION 

One important part of the global 
hydrological system is a catchment, 
which separates rainfall into three 
parts: evaporation, overland flow, 

which goes directly to a stream, and 
infiltration, which flows at a much 
slower rate vertically through the 
soil to a saturated zone, where the 


water then flows horizontally and 
merges into the stream or is stored 
as groundwater. This involves four 
components: evaporation, overland 

flow, flow in unsaturated soil and 
flow in saturated soil. If we are to 
understand the way in which spatial 
variations of hydrological parameters 
affect the water balance of a catch- 
ment, including evaporation, runoff, 
erosion and transport of minerals, 
we have to model the flow over and 
within a hillside explicitly. 

Several workers have written hillslope 
models (Ref. 6). All of these have 
had limitations mainly because they 
could not be executed in a reasonable 
amount of time. This computer limita- 
tion is much reduced if a parallel 
processor is used, as it is possible 
to write the model in such a way that 
it can be solved in parallel at many 
points at one time. 

The first stage of the work is to 
verify that such a model may be 
efficiently executed on a parallel 
processor. To this end we have coded 
and tested a model which already exists 
(Ref. 2) on a serial computer, an 
IBM 3081, and which contains two of 
the four previously identified 
components, namely evaporation and 
unsaturated flow. We have transferred 
an existing model to the MPP so that 
we may verify the accuracy of the 
parallel computations. We also plan 
to estimate the computational 
efficiency of the full catchment model 
in a parallel machine over the 
equivalent model on a serial machine. 
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if such a serial model were to be 
successfully created. 

In this paper we discuss the physics 
of the one dimensional model, the 
method of solution, and its adapt- 
ability to the parallel architecture. 

Model Description 

The complete soil model is described 
in Ref. 2. The soil moisture profile 
in the unsaturated zone is the solution 
of the continuity equation 

90 9q m 

9t az 

where e(z,t) is the volumetric soil 
moisture (water volume/soil volume) 
at depth z at time t, and q m is the 
vertical soil moisture flux, modelled 
by (Ref. 8) 

90 

q m = K - D 0 (— -) (2) 

9z 


K is the hydraulic conductivity, and 
D 0 is the moisture diffusion 
coefficient, which depends on soil 
moisture 0 , the soil matric poten- 
tial and other physical constants. 
K and \j> are estimated with the para- 


meterization (Ref. 3) 


K ( 0 ) = K $ (0/0 s ) 2b+3 

(3a) 

+(e) = <M 0 / e s )" b 

(3b) 


in which 0 S , K s , and are moisture 
content, conductivity, and potential 
at saturation. The value of b depends 
on soil texture. 

The temperature profile in the soil 
may be modelled with Fourier's 
equations and is one option in the 
serial version. However, we have 
chosen to implement the computationally 
simpler yet physically adequate 
force-restore equations (Ref. 7). The 


surface and deep soil temperatures T s 
and T are modelled by 
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where G is the soil surface heat flux, 
X and c are respectively the soil 
thermal conductivity and heat capacity 
and t is the length of the day. T is 
the temperature at the depth where 
fluctuations are seasonal rather than 


diurnal. For most soils this depth is 
about 2 meters. The conductivity and 
heat capacity are modelled as functions 
of soil moisture and soil type with 
the model of Ref. 4. 

To solve these equations for 0, T s , 
and T, boundary conditions must be 
supplied for moisture and temperature 
both at the air/soil interface and in 
the bottom layer of the profile. In 
principle, either the fluxes q Q and 
G or the variables 0 and T could be 
specified. In the model, surface heat 
and moisture fluxes are computed to 
model the effects of the environment 
(i.e., rainfall, evapotranspi ration, 
radiation, etc.) on the profile 
evolution. At the bottom of the 
moisture profile a choice of flux 
or moisture boundary condition is 
used. One can specify constant 
moisture, a downward moisture flux 
equal to the hydraulic constant of the 
bottom layer, or any constant value. 

The energy balance equation provides 
the surface fluxes: 

G = R + LE + H (5) 

All fluxes are positive downward. G is 
the heat absorbed by the soil, R is 
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the net radiation flux, LE is the 
evapotranspi rati on energy flux, and 
H is the sensible heat. After finding 
the solution, the surface moisture 
flux q 6 is set equal to E and 6 is 
used in the force-restore equations. 

The net radiation R is divided into 
average short- and long-wavelength 
components: 


The sensible heat flux H in the con- 
tinuity equation (21) is calculated by 

H = — y C ] XT a ( T s - T a ) (9) 

The terms of heat balance equations are 
functions of the known surface tempera- 
ture's and the meteorological variables 
e a , U a , and T a . 


R = R short + R long 


( 6 ) 


METHOD OF SOLUTION 


Either or both components may be 
either estimated or measured. All 
four options are allowable within the 
computer program, standard models such 
as the Brunt model for long-wave 
radiation being available as options 
to estimate either or both components. 

A standard model for the latent heat 
flux under neutral atmospheric stability 
is (Ref. 5) 


LE = 


pc p k 2 U a 


Y ln 2 (z/z n ) 


■(e s - e a ) (7) 


where p is the density of air (1.15 x 
10" J g cm"^), c„ the air specific heat, 
k the von Karman constant (0.£), z 0 the 
surface roughness parameter, U a the 
wind velocity (centimeters per second) 
at height z averaged over a suitable 
time period (~ 1 hour), e a the vapor 
pressure at height z, e s the vapor 
pressure at the soil surface, and y 


The continuity equations (1 and 4) are 
solved by expressing the spatial deriva- 
tives of the moisture fluxes as finite 
differences, and then using a fourth order 
predictor-corrector for the time integra- 
tion. 

The soil is divided into N layers of 
varying thicknesses Az-j , where N and 
Az-j are input variables. At a parti- 
cular time the moisture fluxes at the 
N-l interior boundaries are calculated 
by evaluating Eq. 2. The surface and 
bottom fluxes are computed by evaluating 
the boundary condition equations at 
the top and bottom boundaries. This 
gives the flux at the N+l layer boundar- 
ies, and the derivative with respect 
to depth is approximated for the ith 
layer by 


8c >m 

8 Z 


q m i+ i - q mi 


AZi 


( 10 ) 


the psychrometric constant 
K-l). 

This may be expressed as 
LE = -C]IT a (e s - e a ) 

(0.61808 mbar 

The continuity equations 

(1 and 4) are 

(8a) 

of the form 
+ 

3y + 

= f (t,y) 

(ID 

pc k 2 

where C] = 

(8b) 

at 


y ln 2 (z/z 0 ) 


where the state vector y 

has N+2 


Input to the program i_ncludes the 
constant C] and both U a and e a as 
functions of time. 


of N layers, T s and T. The first N 
elements of f are Eq. 10, and f n +| 
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and f n +2 are the right hand sides of 
the force restore equations, 4a and 4b. 


■►(c) 

y (t+At) = y(t) + 
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This is readily solved by an Adams- 
Bashforth predictor-corrector algorithm 
(Ref. 1,9). We first introduce the 
backward difference operator v such 
that 

+ ■> 
vf (t ) = f(t) - f(t - vt) 

Higher order backward differences are 
evaluated by successive applications 
of the operator; 

v 2 f(t) = v(vf(t)) = f(t) - 
2f(t-At) - f(t-2At) 


V 3 f(t) = f(t) -3f(t-At) + 
3f(t-2At) -f(t-3At) 

V^f(t) = f(t) -4f (t-At) + 

6f(t-2At) -4f(t-3At) + f(t-4At) 


Using only the state vector y and the 

physical model f , at time t, an 
estimate (called the predictor) at 
time t+At is 

- (P) - 1 

y (t+At) = y(t) + At[l + — V + 

2 

5 3 251 

V 2 + — V 3 + V 4 ]f(t) 

12 8 720 

+ (P) 

Then, using y (t+At) to evaluate 

the model f(t+At) at the t+At, 

one may compute another estimate of the 

state vector called the corrector; 


At f (t+At ) 

720 

At + 

+ [469 + 109 V + 49 V 2 +19 V 3 ]f(t) 

720 

(13) 

The physical model equations are 
evaluated only once each time step 
•¥ 

to calculate f(t+At). The difference 
between 

* (P) - (c) 

y and y is a reliable estimate 
of the error, and the software deter- 
mines if each element of this 
difference lies within a user specified 
window. If all differences are smaller 
than this window, the integration step 
size (At) is doubled, leading to 
increased computational efficiency. 

If any difference is too large, the 
step size is halved. This halving and 
doubling requires no re-evaluation of 
the model equations. The values of the 
four backward differences for the new 
integrator time (whether for halving, 
doubling, integration) are calculated 
as linear combinations of the four 
back values for the old integrator 
time. The fact that the same calcu- 
lations are done on many pieces of 
data at the same time as well as the 
boolean nature of the error window 
checks make the Hillslope Model an 
ideal application for a Single 
Instruction Multiple Data (SIMD) 
Massively Parallel Processor (MPP). 

Mapping the Hillslope Model to the 
MPP Architecture 

The initial mapping the Hillslope 
model to the MPP involved three 
phases: 

1. Determination of scalar and paral- 
lel components of the calculations 

2. Data initialization on the MPP 

3. Data output to the VAX 

Both calculations within the layers 
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of the hi 1 1 s 1 ope and at the boundaries 
are required at each time step of the 
model integrations. The calculations 
within the hillslope are exactly the 
same at each layer, so they were set 
up so that they could be done in unison 
by means of parallel arrays in the MPP 
Array Unit. These parallel values 
consisted of the soil moisture, temp- 
erature, depth, layer thicknesses, 
fluxes, derivatives, backward differ- 
ences, as well as predicted and 
corrected values. Because the model 
was set up with a view to extension 
to a two dimensional model, each row 
in the Array Unit was to represent 
one vertical column in the hillslope, 
with the components of the row repre- 
senting the layers in it. This initial 
version of the model was set up with 
14 layers in the soil column to match 
the serial calculations. Thus in this 
one-dimensional rendition, only part 
of the first row of the Array Unit 
was used for each parallel array of 
data. In addition, since the tempera- 
ture and moisture values constitute 

the state vector (y) and thus require 
the same type of computations, they 
were placed together in the first 
row of the array unit. While this is 
initially wasteful in terms of the 
computational power of the MPP, it 
allows an easy extension to a two 
dimensional model which will use the 
other rows and hence the full cap- 
abilities of the MPP. 

The boundary conditions involve only 
single values at the top and bottom 
of the soil column modified with 
scalar input data, and so they may 
be more efficiently done serially with 
the Main Control Unit (MCU). Thus, 
implementing the model on the MPP 
involved scalar and parallel compo- 
nents as well as communication between 
the scalar and parallel components 
for boundary condition values. The 
MPP architecture is very efficient 
at passing scalar values back and 
forth between the MCU and the Array 
Unit as it has a register designed 


for this purpose. Two special purpose 
assembly routines were written to take 
advantage of this. One took a user 
specified row and column in a parallel 
array and placed a given scalar value 
there. Another took a user specified 
row and column in a parallel array and 
retrieved a value into a scalar in 
the MCU. These were used in the 
boundary condition calculations. 

Data intialization on the MPP therefore 
involved initialization of parallel 
arrays as well as scalar data. These 
arrays were initialized in FORTRAN 
arrays and scalars on the VAX and 
transferred over to the MPP scalars 
and arrays via the DR780 and DRllb 
buffers. The capability of the stager 
to permute the data bits between the 
VAX and the Array Unit was used to 
change the format of the floating point 
data from the VAX format to the MPP 
format while the data was being trans- 
ferred between the VAX and the MPP. 
Because of this, parallel data trans- 
mission between the VAX and the MPP 
appeared transparent. Explicit bit 
swapping of scalar integer data between 
the VAX and MCU to accomodate the two 
separate integer formats was still 
necessary, however, as these values 
were transmitted across the DRllb 
which has no data permuting capa- 
bility. 

Special purpose routines were written 
to enable data output to the VAX of 
the information in the parallel arrays. 
These routines passed parallel arrays 
into predefined VAX FORTRAN arrays. 

MPP Pascal callable VAX FORTRAN 
routines were written which could write 
out the data in these arrays for user 
examination of intermediate and final 
results. In addition, some high level 
I/O equivalent to Pascal 'writeln' 
was written to speed the debugging 
process by bypassing the CAD debugger 
enti rely. 

The next step in mapping the Hillslope 
model to the MPP architecture involved 
implementing the physical model 
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equations in MPP Pascal. 

The predicted and corrected state 
vector values (soil moisture and 
temperature) involve a calculation 
of the type (see Eqs. 12 and 13) 

-* 

y = y + At £ W * tp 
n+1 n i i i 

where 

W are the backward differences 
i for each layer 


tp are the constant scalar coeffi- 
i cients in the predictor equation 


By assigning the backward differences to 
individual parallel arrays so that order 
ith backward differences for each layer 
are stored in the first row of the ith 
parallel array, the above calculation 
can be solved for each layer with a 
series of parallel operations. 

W * (tp * At) involves only multi- 
i i 

plication of a parallel array by a 
scalar as the coefficients of the 
predictor equation would be the same 
for each layer with the same order 
backward differences. The remaining 
sums can be done for each layer in 
parallel. Once the predicted and 
corrected values of each layer are 
available, their differences can be 
simultaneously calculated. Compari- 
sons with user input error windows 
can be done all at once with simple 
boolean tests on the parallel arrays. 
Recalculation of the backward 
differences for each layer can be done 
in unison with the backward difference 
arrays. 

There are additional calculations 
needed at each layer and each time 
step in order to include the depend- 
encies of the heat and moisture 
fluxes on the moisture and tempera- 
ture profiles through the column. 


These involve the matric potential, 
the hydraulic conductivity, the 
moisture fluxes as well as the deriva- 
tives of the moisture fluxes and 
temperatures. Special calculations 
must be done at the boundaries in 
order to include the effects of the 
air/soil interface at the top as well 
as the effects of saturated soil at 
the bottom of the column. 

The matric potential and hydraulic 
conductivity calculations involve only 
parallel computations. They both 
depend on the calculation of the type: 

result := a * A 15 

where 'a' and ' b ' are scalars and 'A' 
is a parallel array (which is the same 
for both). A parallel version of this 
was calculated by: 

temp := ln( A ) 

result := a * exp( b * temp ) 

where 'temp' is a temporary variable. 

The savings in time by only calculat- 
ing the natural logarithm once and 
doing the calculations in parallel 
will be considerable as the model is 
extended to more dimensions. The 
derivatives are also easily available 
through parallel operations as the 
change between layers can be obtained 
easily through the 'shift' operator 
and the thickness of each layer is 
stored in a parallel array. The same 
is true of the fluxes. Computation of 
thermal parameters involved mixing some 
scalar values with information from 
various points in parallel arrays and 
storing the results back in the paral- 
lel arrays. Here is where the special 
purpose routines were used. Simple 
'get' and 'put' routines were written 
to get/put a value out of/into a user 
specified row and column in a parallel 
array. This could be done quickly 
using the special capabilities of the 
MPP architecture. Moreover, as the 
needed scalar computations were being 
done, parallel operations could be 
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performed concurrently in the Array 
Unit. 

Minimizing the Program Execution Time 
by Using the Capabilities of the MPP 
Through MPP Pascal 

The multiprocessing capabilities of 
the MPP are easily available through 
MPP Pascal. Scalar calculations are 
performed in the Main Control Unit. 

This is a special purpose microcoded 
16-bit processor which has a 16 bit 
hardware multiplier. Parallel calcu- 
ations are performed in the Array Unit 
with 16384 specially designed pro- 
cessors. The scalar Main Control Unit 
calculations and the parallel Array 
Unit calculations are done simul- 
taneously except when the Main Control 
unit is expecting a scalar result from 
the Array Unit. This would be the 
case, for example, when doing a maximum, 
minimum or sum operation on a parallel 
array. MPP Pascal produces a code 
which runs in the Main Control Unit 
and makes calls to library and special 
purpose routines which run in the 
Array Unit. There is also a call queue 
which enables the Main Control Unit 
to stack its calls (including register 
transfer data) to the Array Unit. These 
calls may be stacked up to 15 deep. 

Thus a parallel operation, such as an 
assign, in MPP Pascal translates to a 
single call by the Main Control Unit 
to the Array Unit to begin its process- 
ing with its own processors. The Main 
Control Unit is then free to do either 
scalar calculations or send another 
parallel operation request to the 
Array Unit. By recognizing that the 
Main Control Unit is a serial processor 
it becomes apparent that sending 
requests to the Array Unit to perform 
parallel operations and then doing 
scalar operations in the Main Control 
Unit allows the scalar and parallel 
calculations to be done at the same 
time. 

This feature of the MPP was used 
extensively in the boundary condition 
calculations to reduce program 


execution time. It proved to be the 
single most important tool for reduc- 
tion of program running time. Other 
techniques involved setting up masks 
at initialization time and reusing 
them instead of regenerating them with 
'WHERE' statements, and also the use 
of temporary stores for the results of 
natural logarithm and exponent functions 
which were to be used in more than one 
calculation. 

A 24 hour simulation on the MPP with 
14 layers used 30 seconds of computer 
time, whereas the identical simulation 
on the IBM 3081 serial machine used 4 
seconds. Adding more layers to the MPP 
model would use virtually the same 
amount of time, whereas the execution 
time on the serial machine is 
approximately linear with the number of 
layers. We expect, then, that the 
break even point is approximately 115 
layers. Modelling an entire catchment 
could easily require ten times this 
number, so we expect that the parallel 
architecture of the MPP will provide 
significant savings in computer time 
over a similar model on a serial 
machine. 


CONCLUSION 

We have coded a one-dimensional 
hydrological model of the surface 
energy and moisture balance and 
moisture flow in the unsaturated 
zone, as a precursor to a complete 
catchment model . 

By comparing to an identical model 
on an IBM 3081 serial machine, we 
have shown that it is feasible to use 
the MPP for numerical models such as 
this one, and that the parallel arch- 
itecture makes such calculations more 
efficient when the physical model 
includes modelling the same processes 
at many different points in space. 
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